An Enhanced Dataflow Analysis to Automatically Tailor Side Channel Attack Countermeasures to Software Block Ciphers
Protecting software implementations of block ciphers from side channel attacks is a significant concern to realize secure embedded computation platforms. The relevance of the issue calls for the automation of the side channel vulnerability assessment of a block cipher implementation, and the automated application of provably secure defenses. The most recent methodology in the field is an application of a specialized data-flow analysis, performed by means of the LLVM compiler framework, detecting in the AES cipher the portions of the code amenable to key extraction via side channel analysis. The contribution of this work is an enhancement to the existing data-flow analysis which extends it to tackle any block cipher implemented in software. In particular, the extended strategy takes fully into account the data dependencies present in the key schedule of a block cipher, regardless of its complexity, to obtain consistently sound results. This paper details the analysis strategy and presents new results on the tailored application of power and electro-magnetic emission analysis countermeasures, evaluating the performance on both the ARM Cortex-M and the MIPS ISA. The experimental evaluation reports a case study on two block ciphers: the first designed to achieve a high security margin at a non-negligible computational cost, and a lightweight one. The results show that, when side-channel-protected implementations are considered, the high-security block cipher is indeed more efficient than the lightweight one
Constant weight strings in constant time: a building block for code-based post-quantum cryptosystems
Code-based cryptosystems often need to encode either a message or a random bitstring into one of fixed length and fixed (Hamming) weight. The lack of an efficient and reliable bijective map presents a problem in building constructions around the said cryptosystems to attain security against active attackers. We present an efficiently computable, bijective function which yields the desired mapping. Furthermore, we delineate how the said function can be computed in constant time. We experimentally validate the effectiveness and efficiency of our approach, comparing it against the current state-of-the-art solutions, achieving three to four orders of magnitude improvements in computation time, and validate its constant runtime
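To make the notion of a bijective map onto fixed-weight strings concrete, the following is a minimal sketch based on the textbook combinatorial number system. It is not the function proposed in the abstract above, and it is not constant time (its branches depend on the input); the names `unrank_constant_weight` and `rank_constant_weight` are hypothetical and chosen only for illustration.

```python
from math import comb


def unrank_constant_weight(index, length, weight):
    """Map an integer in [0, C(length, weight)) to a list of bits
    with exactly `weight` ones (combinatorial number system).

    Illustrative only: NOT constant time, NOT the paper's construction.
    """
    if not 0 <= index < comb(length, weight):
        raise ValueError("index out of range for this length and weight")
    bits = [0] * length
    remaining = weight
    # Scan positions from the top; the first position whose binomial
    # coefficient fits into the index hosts the next set bit.
    for pos in range(length - 1, -1, -1):
        if remaining == 0:
            break
        c = comb(pos, remaining)
        if index >= c:
            bits[pos] = 1
            index -= c
            remaining -= 1
    return bits


def rank_constant_weight(bits):
    """Inverse map: recover the integer from a fixed-weight bit list."""
    index = 0
    ones_seen = 0
    for pos, bit in enumerate(bits):
        if bit:
            ones_seen += 1
            index += comb(pos, ones_seen)
    return index
```

Calling `unrank_constant_weight(i, 5, 2)` for every `i` from 0 to 9 enumerates each weight-2 string of length 5 exactly once, and `rank_constant_weight` inverts it.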
Performance and Efficiency Exploration of Hardware Polynomial Multipliers for Post-Quantum Lattice-Based Cryptosystems
The significant effort in the research and design of large-scale quantum computers has spurred a transition to post-quantum cryptographic primitives worldwide. The post-quantum cryptographic primitive standardization effort led by the US NIST has recently selected the asymmetric encryption primitive Kyber as its candidate for standardization and indicated NTRU as a valid alternative if intellectual property issues are not solved. Finally, a more conservative alternative to NTRU, NTRU Prime, was also considered as an alternate candidate, due to its design choices that preemptively remove the possibility of a large set of attacks. All the aforementioned asymmetric primitives provide good performance, and are prime choices to provide IoT devices with post-quantum confidentiality services. In this work, we present a comprehensive exploration of hardware designs for the computation of polynomial multiplications, the workhorse operation in all the aforementioned cryptosystems, with a thorough analysis of performance, compactness and efficiency. The presented designs cope with the differences in the arithmetic of the polynomial rings employed by distinct cryptosystems, benefiting from configurations and optimizations that are applicable at synthesis time and/or run time. In this context, we target a use case scenario where long-term key pairs are used, such as the ones for VPNs (e.g., over IPSec), secure shell protocols and instant messaging applications. Our high-performance design variants exhibit latency figures comparable to the ones needed for the execution of the symmetric cryptographic primitives also included in the post-quantum schemes. Notably, the performance figures of the designs proposed for NTRU and NTRU Prime surpass the ones described in the related literature
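As a software illustration of the operation these hardware designs accelerate, the sketch below multiplies two polynomials with coefficients modulo q in a quotient ring, either modulo x^n - 1 (the cyclic case, as in NTRU) or modulo x^n + 1 (the negacyclic case, as in Kyber). It is a plain schoolbook routine written for clarity, not a model of the architectures in the abstract above; the function name `polymul_mod` and its parameters are hypothetical, and the NTRU Prime ring (modulo x^p - x - 1) is not covered.

```python
def polymul_mod(a, b, q, negacyclic):
    """Schoolbook product of a and b in Z_q[x]/(x^n + 1) when
    `negacyclic` is True, or in Z_q[x]/(x^n - 1) otherwise.

    a and b are coefficient lists of equal length n, lowest degree first.
    Illustrative sketch only; real designs use NTT or other fast methods.
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("operands must have the same length")
    result = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            term = ai * bj
            if k >= n:
                # Wrap around: x^n equals -1 (negacyclic) or +1 (cyclic).
                k -= n
                if negacyclic:
                    term = -term
            result[k] = (result[k] + term) % q
    return result
```

For example, with n = 4 and q = 17, multiplying 1 + x by x^3 gives x^3 + 1 in the cyclic ring and x^3 - 1 (that is, coefficients `[16, 0, 0, 1]`) in the negacyclic ring.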
Parallel hardware architectures for the cryptographic Tate pairing
Identity-based cryptography uses pairing functions, which are sophisticated bilinear maps defined on elliptic curves. Computing pairings efficiently in software is presently a relevant research topic. Since such functions are very complex and slow in software, dedicated hardware (HW) implementations are worthy of being studied, but presently only very preliminary research is available. This work addresses the problem of designing parallel dedicated HW architectures, i.e., co-processors, for the Tate pairing, in the case of the Duursma-Lee algorithm in characteristic 3. Formal scheduling methodologies are applied to carry out an extensive exploration of the architectural solution space, evaluating the obtained structures by means of different figures of merit such as computation time, circuit area and combinations thereof. Comparisons with the (few) existing proposals are carried out, showing that a large space exists for the efficient parallel HW computation of pairings
A Code-specific Conservative Model for the Failure Rate of Bit-flipping Decoding of LDPC Codes with Cryptographic Applications
Characterizing the decoding failure rate of iteratively decoded Low- and
Moderate-Density Parity Check (LDPC/MDPC) codes is paramount to build
cryptosystems based on them, able to achieve indistinguishability under
adaptive chosen ciphertext attacks. In this paper, we provide a statistical
worst-case analysis of our proposed iterative decoder obtained through a simple
modification of the classic in-place bit-flipping decoder. This worst-case
analysis allows us to derive both the worst-case behaviour of an LDPC/MDPC code
picked among the family with the same length, rate and number of parity checks,
and a code-specific bound on the decoding failure rate. The former result
allows us to build a code-based cryptosystem enjoying the δ-correctness
property required by IND-CCA2 constructions, while the latter result allows us
to discard code instances which may have a decoding failure rate significantly
different from the average one (i.e., representing weak keys), should they be
picked during the key generation procedure
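For readers unfamiliar with the baseline, the following is a minimal sketch of a classic in-place bit-flipping decoder, i.e. the starting point that the abstract above says is modified, not the modified decoder itself. The fixed flipping threshold, the iteration cap, and the name `in_place_bit_flip` are assumptions made for illustration.

```python
def in_place_bit_flip(H, syndrome, threshold, max_iters=10):
    """Classic in-place bit-flipping over GF(2).

    H is a parity-check matrix given as a list of rows of 0/1 values,
    `syndrome` is the syndrome of the unknown error vector. Returns
    (error_estimate, success). A fixed `threshold` is an assumption;
    practical decoders typically adapt it.
    """
    r, n = len(H), len(H[0])
    s = list(syndrome)
    e = [0] * n
    # For each bit, the list of parity checks it participates in.
    checks = [[i for i in range(r) if H[i][j]] for j in range(n)]
    for _ in range(max_iters):
        if not any(s):
            break
        for j in range(n):
            # Count the unsatisfied parity checks touching bit j.
            upc = sum(s[i] for i in checks[j])
            if upc >= threshold:
                # "In-place": flip the bit and update the syndrome
                # immediately, so later bits see the new syndrome.
                e[j] ^= 1
                for i in checks[j]:
                    s[i] ^= 1
    return e, not any(s)
```

Decoding succeeds when the syndrome reaches zero within the iteration budget; the decoding failure rate discussed in the abstract is the probability that this does not happen for a random error of the prescribed weight.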
Supporting Concurrency and Multiple Indexes in Private Access to Outsourced Data
Data outsourcing has recently emerged as a successful solution allowing individuals and organizations to delegate data and service management to external third parties. A major challenge in the data outsourcing scenario is how to guarantee proper privacy protection against the external server. Recent promising approaches rely on the organization of data in indexing structures that use encryption and the dynamic allocation of encrypted data to physical blocks for destroying the otherwise static relationship between data and the blocks in which they are stored. However, dynamic data allocation implies the need to re-write blocks at every read access, thus requesting exclusive locks that can affect concurrency. Also, these solutions only support search conditions on the values of the attribute used for building the indexing structure.
In this paper, we present an approach that overcomes such limitations by extending the recently proposed shuffle index structure with support for concurrency and multiple indexes. Support for concurrency relies on the use of several differential versions of the data index that are periodically reconciled and applied to the main data structure. Support for multiple indexes relies on the definition of secondary shuffle indexes that are then combined with the primary index in a single data structure whose content and allocation are unintelligible to the server. We show how using such differential versions and combined index structure guarantees privacy, provides support for concurrent accesses and multiple search conditions, and considerably increases the performance of the system and the applicability of the proposed solution
Automated instantiation of side-channel attacks countermeasures for software cipher implementations
Side Channel Attacks (SCA) have proven to be a practical threat to the security of embedded systems, exploiting the information leakage coming from unintended channels concerning an implementation of a cryptographic primitive. Given the large variety of embedded platforms, and the ubiquity of the need for secure cryptographic implementations, a systematic and automated approach to deploy SCA countermeasures at design time is strongly needed. In this paper, we provide an overview of recent compiler-based techniques to protect software implementations against SCA, making them amenable to automated application in the development of secure-by-design systems
Simulation-Time Security Margin Assessment against Power-Based Side Channel Attacks
A sound design-time evaluation of the security of a digital device is
a goal which has attracted a great amount of research effort lately.
Common security metrics for the attack consider either the theoretical leakage of the device, or assume as a security metric the
number of measurements needed in order to be able to always recover the secret key. In this work we provide a combined security
metric taking into account the computational effort needed to mount
the attack, in combination with the quantity of measurements to
be performed, and provide a practical lower bound for the security
margin which can be employed by a secure hardware designer. This
paper represents a first exploration of a design-time security metric
incorporating the computational effort required to mount a
power-based side channel attack in the security level assessment of the
device. We take into account in our metric the possible presence of
masking and hiding schemes, and we assume the best measurement
conditions for the attacker, thus leading to a conservative estimate
of the security of the device. We provide a practical validation of
our security metric through an analysis of transistor-level accurate
power simulations of a 128-bit AES core implemented on a 65 nm
library
Encasing Block Ciphers to Foil Key Recovery Attempts via Side Channel
Providing efficient protection against energy-consumption-based side channel attacks (SCAs) for block ciphers is a relevant topic for the research community, as current overheads are in the 100× range. Unprofiled SCAs exploit information leakage from the outermost rounds of a cipher; we propose a solution encasing it between keyed transformations amenable to an efficient SCA protection. Our solution can be employed as a drop-in replacement for an unprotected implementation, or be retrofitted to an existing one, while retaining communication capabilities with legacy insecure endpoints. Experiments on a Cortex-M4 μC show performance improvements in the range of 60×, compared with available solutions